

NN4SysBench: Characterizing Neural Network Verification for Computer Systems

Neural Information Processing Systems

We present NN4SysBench, a benchmark suite for neural network verification composed of applications from the domain of computer systems. We call these neural networks for computer systems, or NN4Sys. NN4Sys is booming: there are many proposals for using neural networks in computer systems (for example, databases, OSes, and networked systems), many of which are safety-critical. Neural network verification is a technique to formally verify whether neural networks satisfy safety properties. We observe, however, that NN4Sys has unique characteristics that today's verification tools overlook and support only partially. This benchmark suite therefore aims to bridge the gap between NN4Sys and verification by using impactful NN4Sys applications as benchmarks that illustrate computer systems' unique challenges. We also build a compatible version of NN4SysBench, so that today's verifiers can work on these benchmarks with approximately the same verification difficulty.






JLR suppliers 'face bankruptcy' due to hack crisis

BBC News

The past two weeks have been dreadful for Jaguar Land Rover (JLR), and the crisis at the car maker shows no sign of coming to an end. A cyber attack, which first came to light on 1 September, forced the manufacturer to shut down its computer systems and close production lines worldwide. Its factories in Solihull, Halewood, and Wolverhampton are expected to remain idle until at least Wednesday, as the company continues to assess the damage. JLR is thought to have lost at least £50m so far as a result of the stoppage. But experts say the most serious damage is being done to its network of suppliers, many of which are small and medium-sized businesses.


Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks

Youstra, Jack, Mahfoud, Mohammed, Yan, Yang, Sleight, Henry, Perez, Ethan, Sharma, Mrinank

arXiv.org Artificial Intelligence

Large language model fine-tuning APIs enable widespread model customization, yet pose significant safety risks. Recent work shows that adversaries can exploit access to these APIs to bypass model safety mechanisms by encoding harmful content in seemingly harmless fine-tuning data, evading both human monitoring and standard content filters. We formalize the fine-tuning API defense problem, and introduce the Cipher Fine-tuning Robustness benchmark (CIFR), a benchmark for evaluating defense strategies' ability to retain model safety in the face of cipher-enabled attackers while achieving the desired level of fine-tuning functionality. We include diverse cipher encodings and families, with some kept exclusively in the test set to evaluate generalization across unseen ciphers and cipher families. We then evaluate different defenses on the benchmark and train probe monitors on model internal activations from multiple fine-tunes. We show that probe monitors achieve over 99% detection accuracy, generalize to unseen cipher variants and families, and compare favorably to state-of-the-art monitoring approaches. We open-source CIFR and the code to reproduce our experiments to facilitate further research in this critical area. Code and data are available online at https://github.com/JackYoustra/safe-finetuning-api





Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs

Jiang, Houcheng, Zhao, Zetong, Fang, Junfeng, Ma, Haokai, Wang, Ruipeng, Deng, Yang, Wang, Xiang, He, Xiangnan

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown strong performance across natural language tasks, but remain vulnerable to backdoor attacks. Recent model editing-based approaches enable efficient backdoor injection by directly modifying parameters to map specific triggers to attacker-desired responses. However, these methods often suffer from safety fallback, where the model initially responds affirmatively but later reverts to refusals due to safety alignment. In this work, we propose DualEdit, a dual-objective model editing framework that jointly promotes affirmative outputs and suppresses refusal responses. To address two key challenges -- balancing the trade-off between affirmative promotion and refusal suppression, and handling the diversity of refusal expressions -- DualEdit introduces two complementary techniques. (1) Dynamic loss weighting calibrates the objective scale based on the pre-edited model to stabilize optimization. (2) Refusal value anchoring compresses the suppression target space by clustering representative refusal value vectors, reducing optimization conflict from overly diverse token sets. Experiments on safety-aligned LLMs show that DualEdit improves attack success by 9.98% and reduces safety fallback rate by 10.88% over baselines.


A new law in this state bans automated insurance claim denials

FOX News

As some health insurance companies have come under fire for allegedly using computer systems to shoot down claims, an Arizona law will soon make the practice illegal in the Grand Canyon State. Republican Arizona House Majority Whip Rep. Julie Willoughby sponsored the legislation, and it was recently signed into law by Democratic Gov. Katie Hobbs. House Bill 2175 requires a physician licensed in the state to conduct an "individual review" and use "independent medical judgment" to determine whether a claim should actually be denied. It also requires a similar review of "a direct denial of a prior authorization of a service" that a provider requested and that "involves medical necessity."